Upgrade to Arrow 4.0.0 #198

Merged
merged 7 commits into from
Jun 17, 2021

GPSnoopy
Contributor

No description provided.

Fix changed key_value_metadata.h path.
Fix LogicalType Unknown enum renamed to Undefined.
Fix changed GetKey() signature override.
Remove ReadBatchSpaced deprecated functions.
jgiannuzzi changed the title from "Point at Arrow-4.0.0 vcpkg PR from Ian Cook." to "Upgrade to Arrow 4.0.0" on Jun 17, 2021
Member

jgiannuzzi left a comment

Thanks, @GPSnoopy!

LGTM 👍

Some clarifications below for posterity 😄

${ParquetCpp_LIBRARIES}
${Arrow_LIBRARIES}
${Boost_LIBRARIES}
- ${Brotli_LIBRARIES}
+ unofficial::brotli::brotlidec-static unofficial::brotli::brotlienc-static unofficial::brotli::brotlicommon-static
Member

We are now using vcpkg's brotli.

BZip2::BZip2
${Crypto_LIBRARIES}
- double-conversion::double-conversion
Member

Arrow 4.0.0 has its own double-conversion.

glog::glog
lz4::lz4
${SSL_LIBRARIES}
+ re2::re2
Member

Arrow 4.0.0 now requires re2.

@@ -76,7 +77,7 @@ extern "C"

PARQUETSHARP_EXPORT void KeyValueMetadata_Free_Entries(const std::shared_ptr<const KeyValueMetadata>* key_value_metadata, const char** keys, const char** values)
{
-    int64_t size = (*key_value_metadata)->size();
+    const int64_t size = (*key_value_metadata)->size();
Member

This is not linked to the Arrow 4.0.0 upgrade, but it's better that way.

Comment on lines -114 to -118
-    PARQUETSHARP_EXPORT ExceptionInfo* LogicalType_Unknown(const std::shared_ptr<const LogicalType>** logical_type)
-    {
-        TRYCATCH(*logical_type = new std::shared_ptr<const LogicalType>(LogicalType::Unknown());)
-    }

Member

UNKNOWN is now UNDEFINED in Arrow 4.0.0 and it's not a real logical type.

-    std::string GetKey(const std::string& key_metadata) const override
+    std::string GetKey(const std::string& key_metadata) override
Member

Method signature changed in Arrow 4.0.0.
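
For anyone hitting the same compile error downstream, here is a minimal sketch of why the `const` had to go. It uses a hypothetical stand-in base class (`KeyRetrieverBase`) rather than the real Arrow header, and assumes only what the diff above shows: the virtual method is non-const in 4.0.0. `MapKeyRetriever` is likewise an illustrative name, not ParquetSharp code.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the key retriever interface after the upgrade:
// GetKey is a non-const pure virtual, as the diff in this PR implies.
class KeyRetrieverBase {
 public:
  virtual ~KeyRetrieverBase() = default;
  virtual std::string GetKey(const std::string& key_metadata) = 0;
};

// Illustrative implementation backed by an in-memory map.
class MapKeyRetriever : public KeyRetrieverBase {
 public:
  explicit MapKeyRetriever(std::map<std::string, std::string> keys)
      : keys_(std::move(keys)) {}

  // The cv-qualifier must match the base declaration. Writing
  // `const override` here fails to compile, because the base class has no
  // const virtual GetKey to override.
  std::string GetKey(const std::string& key_metadata) override {
    const auto it = keys_.find(key_metadata);
    return it == keys_.end() ? std::string() : it->second;
  }

 private:
  std::map<std::string, std::string> keys_;
};
```

Keeping `override` on the derived method is what surfaces this kind of upstream change as a compile error instead of a silently unused overload.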

Comment on lines -32 to -57
-    PARQUETSHARP_EXPORT ExceptionInfo* TypedColumnReader_ReadBatchSpaced_##ParquetType( \
-        std::shared_ptr<ColumnReader>* columnReader, \
-        int64_t batch_size, \
-        int16_t* def_levels, \
-        int16_t* rep_levels, \
-        NativeType* values, \
-        uint8_t* valid_bits, \
-        int64_t valid_bits_offset, \
-        int64_t* levels_read, \
-        int64_t* values_read, \
-        int64_t* null_count, \
-        int64_t* return_value) \
-    { \
-        TRYCATCH( \
-            *levels_read = static_cast<ParquetType##Reader&>(**columnReader).ReadBatchSpaced( \
-                batch_size, \
-                def_levels, \
-                rep_levels, \
-                values, \
-                valid_bits, \
-                valid_bits_offset, \
-                levels_read, \
-                values_read, \
-                null_count);) \
-    } \
-    \
Member

ReadBatchSpaced has been deprecated in Arrow 4.0.0 because it was buggy and no one was using it.
We are simply removing it altogether, assuming no one was using it in ParquetSharp either.
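
If someone did rely on the spaced layout, a rough equivalent can be built on top of a dense read (for example the `ReadBatch` path that remains) by scattering the values with the definition levels. The sketch below is an assumption-laden illustration, not the Arrow implementation: `SpacedBatch` and `SpreadValues` are hypothetical names, and it only handles flat (non-repeated) columns where a value is present exactly when its definition level equals the maximum.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: one slot per level, plus a validity flag per slot.
template <typename T>
struct SpacedBatch {
  std::vector<T> values;
  std::vector<bool> valid;
};

// Scatters densely packed values into one slot per definition level.
// Assumption: flat column, so a value exists iff def_level == max_def_level.
// Null slots hold a value-initialized T and are flagged as invalid.
template <typename T>
SpacedBatch<T> SpreadValues(const std::vector<int16_t>& def_levels,
                            int16_t max_def_level,
                            const std::vector<T>& dense_values) {
  SpacedBatch<T> out;
  out.values.reserve(def_levels.size());
  out.valid.reserve(def_levels.size());

  std::size_t next = 0;  // index of the next unread dense value
  for (const int16_t level : def_levels) {
    const bool present = level == max_def_level && next < dense_values.size();
    out.values.push_back(present ? dense_values[next++] : T());
    out.valid.push_back(present);
  }
  return out;
}
```

With definition levels `{1, 0, 1, 1, 0}`, a maximum level of 1 and dense values `{10, 20, 30}`, the values land in slots 0, 2 and 3, and slots 1 and 4 are marked invalid.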

@@ -1 +1 @@
- https://github.com/microsoft/vcpkg.git 2021.04.30
+ https://github.com/microsoft/vcpkg.git 4dc864e2401f2ed3230c9042a4dd56f6d1c30360
Member

This is the commit on master right after microsoft/vcpkg#17975 was merged.

GPSnoopy merged commit 653ec55 into master on Jun 17, 2021
GPSnoopy deleted the Arrow-4.0.0 branch on June 17, 2021 11:23
jgiannuzzi linked an issue on Jun 17, 2021 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Arrow 4.0.0
2 participants